{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How to align open LLMs in 2025 with DPO & Hugging Face\n",
    "\n",
    "Large Language Models (LLMs) continued their important role in 2024, with several major developments completely outperforming previous models. The focus continued to more smaller, more powerful models from companies like Meta, Qwen, or Google. These models not only became more powerful, but also more efficient.\n",
    "\n",
    "Part of the reason is the continued scaling of post-training methods and datasets. Post-training methods like [Direct Preference Optimization (DPO)](https://huggingface.co/papers/2305.18290), [Proximal Policy Optimization (PPO)](https://huggingface.co/papers/2203.02155), and [Group Preference Optimization (GRPO)](https://huggingface.co/papers/2310.11523) have been used to align models with human preferences, boosting performance and being more aligned with human preferences.\n",
    "\n",
    "Continuing from [How to fine-tune open LLMs in 2025 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2025), this guide focuses on aligning models using Direct Preference Optimization (DPO). We'll build upon our previously fine-tuned model from the [Fine-Tune LLMs in 2025](https://www.philschmid.de/fine-tune-llms-in-2025) guide and improve it further through preference learning.\n",
    "\n",
    "Before we start, let's take a look at the [Direct Preference Optimization (DPO)](https://huggingface.co/papers/2305.18290) paper and understand how it works.\n",
    "\n",
    "**What is DPO?**\n",
    "\n",
    "[Direct Preference Optimization (DPO)](https://huggingface.co/papers/2305.18290) is a simplified approach to align language models with human preferences. Unlike traditional RLHF methods that require training a separate reward model and using complex PPO optimization, DPO treats the alignment problem as a classification task on preference data. This makes it more stable, efficient, and computationally lightweight compared to alternatives. \n",
    "\n",
    "- **Simplicity**: DPO eliminates the need for a separate reward model and complex RL optimization, making it easier to implement and debug\n",
    "- **Stability**: By avoiding the instabilities of PPO training, DPO provides more reliable convergence\n",
    "- **Efficiency**: The direct optimization approach requires less compute and fewer hyperparameters than traditional RLHF\n",
    "- **Performance**: Despite its simplicity, DPO achieves comparable results\n",
    "\n",
    "In this guide, we'll use offline DPO with on-Policy data. Offline DPO is the more compute-efficient, meaning we don't need a Reward model during training in memory. However, we will be staying on-policy. We'll first generate samples from the Supervised Fine-Tuned (SFT) model we trained in the previous guide and score them with a rule-based reward model that checks answer correctness. These scores will then be used to create preference pairs (a \"preferred\" and a \"rejected\" response) for DPO training.\n",
    "\n",
    "If you are going to adjust the example to your own use case, which cannot be verified with a rule-based reward model, you can use a generic reward model like [nicolinho/QRM-Llama3.1-8B-v2](https://huggingface.co/nicolinho/QRM-Llama3.1-8B-v2), which ranks in the top 10 on reward bench or LLM as judge based on your own criteria.\n",
    "\n",
    "You will learn how to:\n",
    "1. [Setup the development environment](#1-setup-the-development-environment)\n",
    "2. [Create on-policy preference dataset from model outputs](#2-create-on-policy-preference-dataset)\n",
    "3. [Align the model using DPO and the Hugging Face `DPOTrainer`](#3-align-the-model-using-dpo)\n",
    "4. [Test and evaluate the aligned model](#4-test-and-evaluate)\n",
    "\n",
    "_Note: This guide is designed for consumer GPUs (24GB+) like the NVIDIA RTX 4090/5090 or A10G, but can be adapted for larger systems._"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Setup development environment\n",
    "\n",
    "Our first step is to install Hugging Face Libraries and Pytorch, vllm, and trl, transformers and datasets. If you haven't heard of trl yet, don't worry. It is a new library on top of transformers and datasets, which makes it easier to fine-tune, rlhf, align open LLMs. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install Pytorch & other libraries, make sure to match your GPU driver version\n",
    "%pip install \"torch==2.5.1\" vllm tensorboard  \"setuptools<71.0.0\" openai \"lm-eval[api]==0.4.5\"  --index-url https://download.pytorch.org/whl/cu121\n",
    "\n",
    "# Install flash-attn\n",
    "%pip install flash-attn \n",
    "\n",
    "# Install Hugging Face libraries\n",
    "%pip install  --upgrade \\\n",
    "  \"transformers==4.48.1\" \\\n",
    "  \"datasets==3.1.0\" \\\n",
    "  \"accelerate==1.3.0\" \\\n",
    "  \"bitsandbytes==0.45.0\" \\\n",
    "  \"peft==0.14.0\" \\\n",
    "  \"hf-transfer==0.1.9\" \\\n",
    "  \"trl==0.13.0\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_Note: you may need to restart the kernel to use updated packages._\n",
    "\n",
    "We will use the [Hugging Face Hub](https://huggingface.co/models) as a remote model versioning service. This means we will automatically push our model, logs and information to the Hub during training. You must register on the [Hugging Face](https://huggingface.co/join) for this. After you have an account, we will use the `login` util from the `huggingface_hub` package to log into our account and store our token (access key) on the disk.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from huggingface_hub import login\n",
    "\n",
    "login(token=\"\", add_to_git_credential=True) # ADD YOUR TOKEN HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Create on-policy preference dataset from model outputs\n",
    "\n",
    "We are going to generate a preference dataset from our SFT model rather than using a different model and off-policy data. On-policy data is generated from the model being trained, which means the preference pairs directly reflect the model's current capabilities and output distribution. This makes the training more relevant and effective.\n",
    "\n",
    "Studies have shown that leveraging on-policy preference data leads to better performance compared to off-policy approaches. [REF 1](https://arxiv.org/abs/2404.14367), [REF 2](https://arxiv.org/abs/2403.10160). \n",
    "\n",
    "As we are building on our previous fine-tuned model we want to furhter improve the performance on MATH related tasks. We will use the 1500 samples from the [philschmid/DMath](https://huggingface.co/datasets/philschmid/DMath). DMath (Diverse Math Word Problems) is a collection of 10K high-quality grade school-level math word problems for the paper [\"It Ain’t Over: A Multi-aspect Diverse Math Word Problem Dataset\"](https://aclanthology.org/2023.emnlp-main.927.pdf). Each solution will be scored by a rule-based reward model that checks answer correctness using a regex. (Thats a very simple and naive approach, better would to additionally use specific format extraction, e.g. \"The Answer is: [ANSWER]\" or additional LLM Judge). If the solution is correct, it will be labeled as preferred, otherwise as rejected. Based on the score we will create difference preference pairs.\n",
    "\n",
    "I prepared [create_preference_dataset.py](./scripts/dpo/create_preference_dataset.py) script to generate the dataset. The dataset will be saved to the Hugging Face Hub. We are going to generate 4 solutions for each input hoping to have at least one correct solution and one incorrect solution. If we don't have a pair we will skip the input.\n",
    "\n",
    "```python\n",
    "python scripts/dpo/create_preference_dataset.py --dataset_id philschmid/DMath --sample_size 5000 --generation_model_name_or_path philschmid/llama-3-1-8b-math-orca-spectrum-10k-ep1 --num_solutions 4 --batch_size 16 \n",
    "```\n",
    "\n",
    "_Note: You can skip this step and used the generated preference dataset from me [philschmid/philschmid-llama-3-1-8b-math-orca-spectr-philschmid-DMath-candidates](https://huggingface.co/philschmid/philschmid-llama-3-1-8b-math-orca-spectr-philschmid-DMath-candidates). It includes 1.9k prefence pairs._"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Align the model using DPO and the Hugging Face DPOTrainer\n",
    "\n",
    "TRL supports the DPO through a dedicated [DPOTrainer](https://huggingface.co/docs/trl/dpo_trainer) for aligning LLMs from preference data, as described in [Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://huggingface.co/papers/2305.18290). The `DPOTrainer` is a subclass of the `Trainer` from the `transformers` library and supports all the same features, including logging, checkpointing, distributed training, and parameter efficient fine-tuning (PEFT). [Direct Preference Optimization (DPO)](https://huggingface.co/papers/2305.18290) aligns LLMs with human preferences using 'prompt', 'chosen', and 'rejected' columns, where each 'prompt' is paired with a 'chosen' response that aligns with human preferences and a 'rejected' response that does not.\n",
    "\n",
    "We prepared a [run_dpo.py](./scripts/dpo/run_dpo.py) scripts, which supports providing a yaml configuration file. This allows you to easily change the model, dataset, hyperparameters, and other settings. This is done by using the `TrlParser`, which parses the yaml file and converts it into the `TrainingArguments` arguments. \n",
    "\n",
    "Based on the [Alignment Handbook](https://github.com/huggingface/alignment-handbook) we know that we need to use a ~10-100x smaller learning rate for DPO compared to SFT. In our example we reduce the learning rate from 2e-4 (SFT) to 5e-6 (DPO). Another important parameter is the `beta` parameter, which is used to control the strength of the alignment, typically something in the range of 0.1 to 0.5. A higher beta means less divergence from the initial reference model or the text generations are very similar in terms of their probability distributions. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The script is kept very simple and is easy to understand. This should help you understand, customize and extend the script for your own use case. We define `dataclasses` for our arguments. Every argument can then be provided either via the command line or by providing a yaml configuration file. That way we have better type safety and intellisense support.\n",
    "\n",
    "```python\n",
    "# ....\n",
    "\n",
    "@dataclass\n",
    "class ScriptArguments:\n",
    "    dataset_id_or_path: str\n",
    "    ...\n",
    "# ....\n",
    "```\n",
    "\n",
    "The training script is separated by `#######` blocks for the different parts of the script. The main training function: \n",
    "1. Logs all hyperperparameters\n",
    "2. Loads the dataset and Tokenizer from Hugging Face Hub or local disk\n",
    "3. Prepares the DPO dataset correctly using the `prompt`, `chosen` and `rejected` column\n",
    "4. Loads the policy model and/or reference model\n",
    "5. Instantiate DPO trainer and starts the training loop (optionally continue training from a checkpoint)\n",
    "6. Saves the model and optionally pushes it to the Hugging Face Hub\n",
    "\n",
    "Below is an example recipe of how we train [Llama 8B model with DPO and Q-LoRA](./receipes/dpo-llama-3-1-8b-qlora.yaml). \n",
    "\n",
    "```yaml\n",
    "# Model arguments\n",
    "model_name_or_path: philschmid/llama-3-1-8b-math-orca-spectrum-10k-ep1\n",
    "tokenizer_name_or_path: philschmid/llama-3-1-8b-math-orca-spectrum-10k-ep1\n",
    "model_revision: main\n",
    "torch_dtype: bfloat16\n",
    "attn_implementation: flash_attention_2\n",
    "use_liger: false\n",
    "bf16: true\n",
    "tf32: true\n",
    "output_dir: runs/dpo-llama-3-1-8b-math-ep3\n",
    "\n",
    "# Dataset arguments\n",
    "dataset_id_or_path: philschmid/philschmid-llama-3-1-8b-math-orca-spectr-philschmid-DMath-candidates\n",
    "\n",
    "# LoRA arguments\n",
    "use_peft: true\n",
    "load_in_4bit: true\n",
    "lora_target_modules: \"all-linear\"\n",
    "# important as we need to train the special tokens for the chat template of llama \n",
    "lora_modules_to_save: [\"lm_head\", \"embed_tokens\"] # you might need to change this for qwen or other models\n",
    "lora_r: 16\n",
    "lora_alpha: 16\n",
    "\n",
    "# Training arguments\n",
    "beta: 0.1\n",
    "max_length: 1536\n",
    "max_prompt_length: 768\n",
    "loss_type: sigmoid # default loss, alternatives: https://huggingface.co/docs/trl/dpo_trainer#loss-functions\n",
    "num_train_epochs: 3\n",
    "per_device_train_batch_size: 1 \n",
    "gradient_accumulation_steps: 8\n",
    "gradient_checkpointing: true\n",
    "gradient_checkpointing_kwargs:\n",
    "  use_reentrant: false\n",
    "learning_rate: 5.0e-6 \n",
    "lr_scheduler_type: constant\n",
    "warmup_ratio: 0.03\n",
    "\n",
    "# Logging arguments\n",
    "logging_strategy: steps\n",
    "logging_steps: 5\n",
    "report_to:\n",
    "- tensorboard\n",
    "save_strategy: \"epoch\"\n",
    "seed: 42\n",
    "\n",
    "# Hugging Face Hub \n",
    "push_to_hub: true\n",
    "  # hub_model_id: llama-3-1-8b-math-orca-qlora-10k-ep1 # if not defined same as output_dir\n",
    "hub_strategy: every_save\n",
    "```\n",
    "\n",
    "This config can be used for single-GPU training (~24GB GPU Memory), if you have more memory available you can increase the `per_device_train_batch_size` and for multi-GPU training with DeepSpeed (see Appendix for full \n",
    "\n",
    "_Note: I ran the script on 1x H100 with a batch size of 8 and 8 gradient accumulation steps. This took ~1 hours to complete._command). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python scripts/dpo/run_dpo.py --config receipes/dpo-llama-3-1-8b-qlora.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "_Note: During the training we want to minimize loss and grow reward/margins metrics. Keep an eye on the reward/margins metrics, if they are not growing you might need to increase the beta parameter or adjust the learning_rate._\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using Q-LoRA only saves the trained adapter weights. If you want to use the model as standalone model, e.g. for inference you might want to merge the adapter and base model. This can be done using the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python scripts/merge_adapter_weights.py --peft_model_id runs/dpo-llama-3-1-8b-math-ep3 --push_to_hub True --repository_id dpo-llama-3-1-8b-math-ep3-merged"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Test and evaluate the aligned model\n",
    "\n",
    "After the training is done we want to evaluate and test our model. Similar to our SFT model, we will evaluate the model on [GSM8K](https://huggingface.co/datasets/openai/gsm8k) dataset to see if it improved performance. GSM8K (Grade School Math 8K) is a dataset of 8.5K high quality linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.\n",
    "\n",
    "Evaluating Generative AI models is not a trivial task since 1 input can have multiple correct outputs. If you want to learn more about evaluating generative models, check out:\n",
    "* [Evaluate LLMs and RAG a practical example using Langchain and Hugging Face](https://www.philschmid.de/evaluate-llm).\n",
    "* [Evaluate LLMs using Evaluation Harness and Hugging Face TGI/vLLM](https://www.philschmid.de/evaluate-llms-with-lm-eval-and-tgi-vllm)\n",
    "* [LLM Evaluation doesn't need to be complicated](https://www.philschmid.de/llm-evaluation)\n",
    "* [Evaluating Open LLMs with MixEval: The Closest Benchmark to LMSYS Chatbot Arena](https://www.philschmid.de/evaluate-llm-mixeval)\n",
    "\n",
    "We are going to use [Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) an open-source framework to evaluate language models on a wide range of tasks and benchmarks. The frameworks support evaluating models behind OpenAI compatible API endpoints, those can be locally or remotely. This super helpful as we can evaluate our model in the same environment we will use for production. \n",
    "\n",
    "\n",
    "We are going to use [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) for testing and deploying our model. TGI is a purpose-built solution for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation using Tensor Parallelism and continous batching. If you are or want to use vLLM you can check the Appendix on how to start the inference server.\n",
    "\n",
    "_Note: Make sure that you have enough GPU memory to run the container. Restart kernel to remove all allocated GPU memory from the notebook._ \n",
    "\n",
    "We will start the on 1 GPU detached. Meaning we can can continue to use the notebook while the container is running. If you have more GPUs you can change the `--gpus` and `--num-shard` flags to the number of GPUs. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "num_gpus=1\n",
    "model_id=philschmid/dpo-llama-3-1-8b-math-ep3-merged # replace with your model id\n",
    "\n",
    "docker run --name tgi --gpus ${num_gpus} -d -ti -p 8080:80 --shm-size=2GB \\\n",
    "  -e HF_TOKEN=$(cat ~/.cache/huggingface/token) \\\n",
    "  ghcr.io/huggingface/text-generation-inference:3.0.1 \\\n",
    "  --model-id ${model_id} \\\n",
    "  --num-shard ${num_gpus}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our container will now start in the background and download the model from Hugging Face Hub. We can check the logs to see the progress with `docker logs -f tgi`.\n",
    "\n",
    "Once our container is running we can send requests using the `openai` or `huggingface_hub` sdk. Here we ll use the `openai` sdk to send a request to our inference server. If you don't have the `openai` sdk installed you can install it using `pip install openai`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# create client \n",
    "client = OpenAI(base_url=\"http://localhost:8080/v1\",api_key=\"-\")\n",
    "\n",
    "system_message = \"\"\"Solve the given high school math problem by providing a clear explanation of each step leading to the final solution.\n",
    "\n",
    "Provide a detailed breakdown of your calculations, beginning with an explanation of the problem and describing how you derive each formula, value, or conclusion. Use logical steps that build upon one another, to arrive at the final answer in a systematic manner.\n",
    "\n",
    "# Steps\n",
    "\n",
    "1. **Understand the Problem**: Restate the given math problem and clearly identify the main question and any important given values.\n",
    "2. **Set Up**: Identify the key formulas or concepts that could help solve the problem (e.g., algebraic manipulation, geometry formulas, trigonometric identities).\n",
    "3. **Solve Step-by-Step**: Iteratively progress through each step of the math problem, justifying why each consecutive operation brings you closer to the solution.\n",
    "4. **Double Check**: If applicable, double check the work for accuracy and sense, and mention potential alternative approaches if any.\n",
    "5. **Final Answer**: Provide the numerical or algebraic solution clearly, accompanied by appropriate units if relevant.\n",
    "\n",
    "# Notes\n",
    "\n",
    "- Always clearly define any variable or term used.\n",
    "- Wherever applicable, include unit conversions or context to explain why each formula or step has been chosen.\n",
    "- Assume the level of mathematics is suitable for high school, and avoid overly advanced math techniques unless they are common at that level.\n",
    "- Return the final in an extra line. Staring with \"The Answer is: [ANSWER]\"\n",
    "\n",
    "# Examples\n",
    "\"\"\"\n",
    "\n",
    "messages = [\n",
    "    {\"role\": \"system\", \"content\": system_message},\n",
    "    # {\"role\": \"user\", \"content\": \"If you converted $140 to 158760 Korean Won, how much is $1 in Korean Won?\"},\n",
    "    {\"role\": \"user\", \"content\": \"Q: Henry and 3 of his friends order 7 pizzas for lunch. Each pizza is cut into 8 slices. If Henry and his friends want to share the pizzas equally, how many slices can each of them have?\\nA:\"},\n",
    "    # {\"role\": \"user\", \"content\": \"The rectangular-shaped cell phone is 9 centimeters (cm) wide and 46 centimeters (cm) in circumference. Find the vertical length of the cell phone?\"},\n",
    "]\n",
    "expected_answer = \"14\"\n",
    "\n",
    "\n",
    "# Take a random sample from the dataset and remove the last message and send it to the model\n",
    "\n",
    "response = client.chat.completions.create(\n",
    "    model=\"philschmid/dpo-llama-3-1-8b-math-ep3-merged\",\n",
    "    messages=messages,\n",
    "    stream=False, # no streaming\n",
    "    max_tokens=1024,\n",
    "    temperature=1.0)\n",
    "response = response.choices[0].message.content\n",
    "\n",
    "# Print results\n",
    "print(f\"Query:\\n{messages[1]['content']}\")\n",
    "print(f\"Original Answer:\\n{expected_answer}\")\n",
    "print(f\"Generated Answer:\\n{response}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Awesome that looks great! Now we can evaluate our model with the [Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).\n",
    "\n",
    "_Note: Make sure to change the model id to your fine-tuned model._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!lm_eval --model local-chat-completions \\\n",
    "  --tasks gsm8k_cot \\\n",
    "  --model_args model=philschmid/dpo-llama-3-1-8b-math-ep3-merged,base_url=http://localhost:8080/v1/chat/completions,num_concurrent=8,max_retries=10,tokenized_requests=False,timeout=180,max_length=4096 \\\n",
    "  --apply_chat_template \\\n",
    "  --fewshot_as_multiturn"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wow, 59% accuracy, thats a 5% improvement from our SFT model, using only ~2k preference pairs for 3 epochs. That shows that our script and config is working correctly. \n",
    "\n",
    "_Note: You might be able to achieve better results with more data, more epochs or tuning the hyperparameters (beta, learning rate, batch size, etc.). I ran some ablations on multi-gpu training and full training with DeepSpeed (see Appendix for full command) and the best results was 62% accuracy._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!docker stop tgi\n",
    "!docker rm tgi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Appendix\n",
    "\n",
    "_Note: Make sure to install deepspeed and accelerate before running the commands. `pip install deepspeed==0.15.4`_\n",
    "\n",
    "\n",
    "## Distributed Training\n",
    "\n",
    "```bash\n",
    "ACCELERATE_LOG_LEVEL=info accelerate launch --num_processes 4 --config_file configs/accelerate_configs/deepspeed_zero3.yaml scripts/dpo/run_dpo.py --config receipes/dpo-llama-3-1-8b.yaml\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pytorch",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "2d58e898dde0263bc564c6968b04150abacfd33eed9b19aaa8e45c040360e146"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
